Proposal Flow
Finding image correspondences remains a challenging problem in the presence
of intra-class variations and large changes in scene layout. Semantic flow
methods are designed to handle images depicting different instances of the same
object or scene category. We introduce a novel approach to semantic flow,
dubbed proposal flow, that establishes reliable correspondences using object
proposals. Unlike prevailing semantic flow approaches that operate on pixels or
regularly sampled local regions, proposal flow benefits from the
characteristics of modern object proposals, which exhibit high repeatability at
multiple scales, and can take advantage of both local and geometric consistency
constraints among proposals. We also show that proposal flow can effectively be
transformed into a conventional dense flow field. We introduce a new dataset
that can be used to evaluate both general semantic flow techniques and
region-based approaches such as proposal flow. We use this benchmark to compare
different matching algorithms, object proposals, and region features within
proposal flow, to the state of the art in semantic flow. This comparison, along
with experiments on standard datasets, demonstrates that proposal flow
significantly outperforms existing semantic flow methods in various settings.
SFNet: Learning Object-aware Semantic Correspondence
We address the problem of semantic correspondence, that is, establishing a
dense flow field between images depicting different instances of the same
object or scene category. We propose to use images annotated with binary
foreground masks and subjected to synthetic geometric deformations to train a
convolutional neural network (CNN) for this task. Using these masks as part of
the supervisory signal offers a good compromise between semantic flow methods,
where the amount of training data is limited by the cost of manually selecting
point correspondences, and semantic alignment ones, where the regression of a
single global geometric transformation between images may be sensitive to
image-specific details such as background clutter. We propose a new CNN
architecture, dubbed SFNet, which implements this idea. It leverages a new and
differentiable version of the argmax function for end-to-end training, with a
loss that combines mask and flow consistency with smoothness terms.
Experimental results demonstrate the effectiveness of our approach, which
significantly outperforms the state of the art on standard benchmarks.
Comment: CVPR 2019 oral paper
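As a rough illustration of what a differentiable stand-in for argmax can look like, the sketch below computes the softmax-weighted expected index over a list of matching scores. This is the generic soft-argmax, not necessarily the variant used in SFNet; the function name `soft_argmax` and the temperature parameter `beta` are assumptions made for this example.

```python
import math


def soft_argmax(scores, beta=10.0):
    """Softmax-weighted expected index over a 1-D list of scores.

    Generic soft-argmax sketch (an assumption, not the SFNet formulation).
    A larger beta pushes the result closer to the hard argmax.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(beta * (s - peak)) for s in scores]
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total


# Hypothetical matching scores for one source position.
demo_scores = [0.1, 0.2, 5.0, 0.3]
sharp_index = soft_argmax(demo_scores, beta=50.0)   # close to the hard argmax
smooth_index = soft_argmax(demo_scores, beta=0.0)   # plain mean of the indices
```

Because the output is a smooth function of the scores, gradients can flow back through it during training, which a hard argmax does not allow.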
Proposal Flow: Semantic Correspondences from Object Proposals
Finding image correspondences remains a challenging problem in the presence
of intra-class variations and large changes in scene layout. Semantic flow
methods are designed to handle images depicting different instances of the same
object or scene category. We introduce a novel approach to semantic flow,
dubbed proposal flow, that establishes reliable correspondences using object
proposals. Unlike prevailing semantic flow approaches that operate on pixels or
regularly sampled local regions, proposal flow benefits from the
characteristics of modern object proposals, which exhibit high repeatability at
multiple scales, and can take advantage of both local and geometric consistency
constraints among proposals. We also show that the corresponding sparse
proposal flow can effectively be transformed into a conventional dense flow
field. We introduce two new challenging datasets that can be used to evaluate
both general semantic flow techniques and region-based approaches such as
proposal flow. We use these benchmarks to compare different matching
algorithms, object proposals, and region features within proposal flow, to the
state of the art in semantic flow. This comparison, along with experiments on
standard datasets, demonstrates that proposal flow significantly outperforms
existing semantic flow methods in various settings.
Comment: arXiv admin note: text overlap with arXiv:1511.0506
Deformable Kernel Networks for Joint Image Filtering
Joint image filters are used to transfer structural details from a guidance
picture used as a prior to a target image, in tasks such as enhancing spatial
resolution and suppressing noise. Previous methods based on convolutional
neural networks (CNNs) combine nonlinear activations of spatially-invariant
kernels to estimate structural details and regress the filtering result. In
this paper, we instead learn explicitly sparse and spatially-variant kernels.
We propose a CNN architecture and its efficient implementation, called the
deformable kernel network (DKN), that outputs sets of neighbors and the
corresponding weights adaptively for each pixel. The filtering result is then
computed as a weighted average. We also propose a fast version of DKN that runs
about seventeen times faster for an image of size 640 x 480. We demonstrate the
effectiveness and flexibility of our models on the tasks of depth map
upsampling, saliency map upsampling, cross-modality image restoration, texture
removal, and semantic segmentation. In particular, we show that the weighted
averaging process with sparsely sampled 3 x 3 kernels outperforms the state of
the art by a significant margin in all cases.
Comment: arXiv admin note: substantial text overlap with arXiv:1903.11286 (IJCV accepted)
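The per-pixel weighted average over sparsely sampled neighbors described in this abstract can be sketched as follows. In DKN the neighbor sets and weights are predicted by a CNN; here they are passed in as plain lists, offsets are restricted to integers, and out-of-range samples are clamped to the border. All of these simplifications, and the function name `sparse_weighted_average`, are assumptions for illustration only.

```python
def sparse_weighted_average(image, offsets, weights):
    """Filter a 2-D image with per-pixel sparse neighbors and weights.

    image:   2-D list of floats (H x W)
    offsets: offsets[y][x] is a list of (dy, dx) integer neighbor offsets
    weights: weights[y][x] is a list of floats, one per offset
    Hypothetical interface; a learned model would supply offsets and weights.
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            norm = 0.0
            for (dy, dx), wt in zip(offsets[y][x], weights[y][x]):
                yy = min(max(y + dy, 0), height - 1)  # clamp row to the border
                xx = min(max(x + dx, 0), width - 1)   # clamp column to the border
                acc += wt * image[yy][xx]
                norm += wt
            # Fall back to the input value if the weights sum to zero.
            output[y][x] = acc / norm if norm else image[y][x]
    return output


# Tiny usage example: every pixel averages itself and its right-hand neighbor.
demo_image = [[1.0, 3.0], [5.0, 7.0]]
demo_offsets = [[[(0, 0), (0, 1)] for _ in range(2)] for _ in range(2)]
demo_weights = [[[1.0, 1.0] for _ in range(2)] for _ in range(2)]
demo_output = sparse_weighted_average(demo_image, demo_offsets, demo_weights)
```

Each output pixel depends on only a handful of samples rather than a dense window, which is what makes the kernels sparse and spatially variant.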